Abstract: As improved versions of the successive cancellation (SC) decoding algorithm, successive cancellation list (SCL) decoding and successive cancellation stack (SCS) decoding are used to improve the finite-length performance of polar codes. Unified descriptions of the SC, SCL and SCS decoding algorithms are given as path searching procedures on the code tree of polar codes. Combining the ideas of SCL and SCS, a new decoding algorithm named successive cancellation hybrid (SCH) is proposed, which can achieve a better trade-off between computational complexity and space complexity. Further, to reduce the complexity, a pruning technique is proposed to avoid unnecessary path searching operations. Performance and complexity analysis based on simulations shows that, with proper configurations, all three improved successive cancellation (ISC) decoding algorithms can achieve a performance very close to that of maximum-likelihood (ML) decoding with acceptable complexity. Moreover, with the help of the proposed pruning technique, the complexities of the ISC decoders can be very close to that of the SC decoder in the moderate and high signal-to-noise ratio (SNR) regime.

Index Terms: Polar codes, successive cancellation decoding, code tree, tree pruning.



I. Introduction 

Polar codes, proposed by Arıkan [1], are proved to achieve the symmetric capacities of binary-input discrete memoryless channels (B-DMCs). This capacity-achieving code family is based on a technique called channel polarization. By performing channel splitting and channel combining operations on independent copies of a given B-DMC, a set of synthesized binary-input channels can be obtained. Let $I(W)$ denote the symmetric capacity of a B-DMC $W$. It is proved in [1] that, with $N = 2^n$ uses of $W$, $n = 1, 2, \dots$, when $N$ is large enough, it is possible to construct $N$ synthesized channels such that $N(1 - I(W))$ of them are completely unreliable and $N I(W)$ of them are noiseless. By transmitting free bits (called information bits) over the noiseless channels and transmitting a sequence of fixed bits (called frozen bits) over the others, polar codes can achieve the symmetric capacity under a successive cancellation (SC) decoder with both encoding and decoding complexity $O(N \log N)$. It has also been proved that the block error probability of a polar code under SC decoding satisfies $P(N, R) \le 2^{-N^{\beta}}$ for any $\beta < 1/2$ when the code length $N$ is large enough and the code rate $R < I(W)$. Furthermore, it was shown by Korada et al. [4] that the error exponent $\beta$ can be arbitrarily close to 1 for large $N$ with a general construction using larger kernel matrices than the $2 \times 2$ matrix proposed by Arıkan. To construct polar codes, the channel reliabilities can be calculated efficiently using Bhattacharyya parameters for binary-input erasure channels (BECs) [1]. For channels other than BECs, however, density evolution is required [3]. More practical methods for calculating the channel reliabilities have also been discussed, and these techniques have been extended to $q$-ary input channels. The channel polarization phenomenon is believed to be universal in many other applications, such as parallel communications [14], [18], coded modulation systems [19], multiple access communications [20], [21], source coding [15], [22], information secrecy [23], [24] and other settings.

Although polar codes have astonishing asymptotic performance, the finite-length performance of polar codes under SC decoding is not satisfying. With the factor graph representation of polar codes, a belief propagation (BP) decoder is introduced by Arıkan in [1]. In [15], Hussami et al. show that the BP decoder can significantly outperform the SC decoder, and point out that, for channels other than the BEC, the schedule of message passing in BP plays an important role. They also show that the performance of the BP decoder can be further improved by using overcomplete factor graph representations over the BEC. Unfortunately, due to the sensitivity of the BP decoder to the message-passing schedule, this is not realized on other channels. In [16], a linear programming (LP) decoder is introduced without any schedule, and using the overcomplete representations can also improve the performance of the LP decoder. But the LP decoder cannot work on channels other than the BEC. Maximum likelihood (ML) decoders are implemented via the Viterbi and BCJR algorithms on the codeword trellis of polar codes [17], but because of their high complexity, they can only work on very short code blocks.

Successive cancellation (SC) decoding of polar codes essentially shares the same idea with the recursive decoding of RM codes [8]. Just as the recursive decoders can be improved by using a list [9] or a stack [10], SC can also be enhanced in the same way.

As an improved version of SC, the successive cancellation list (SCL) decoding algorithm is introduced to approach the performance of the maximum likelihood (ML) decoder with an acceptable complexity [11], [12]. Later, another improved decoding algorithm based on SC, named successive cancellation stack (SCS) decoding, is proposed, whose computational complexity decreases as the signal-to-noise ratio (SNR) increases and can be very close to that of SC decoding in the high SNR regime [13]. Compared with SCL, SCS has a much lower computational complexity. But this comes at the price of a much larger space complexity, and SCS fails to work when the stack is too small. Combining the ideas of SCL and SCS, a new decoding algorithm named successive cancellation hybrid (SCH) is proposed in this paper, which can achieve a better trade-off between computational complexity and space complexity. All three improved SC decoding algorithms, SCL, SCS and SCH, are described in a unified manner as path searching procedures on the code tree. Further, to reduce the complexity, a pruning technique is proposed to avoid unnecessary path searching operations.

The remainder of the paper is organized as follows. Section II reviews the basics of polar coding and describes the SC decoding algorithm as a path searching procedure on a code tree using a posteriori probabilities (APPs) as metrics. Then the three improved successive cancellation (ISC) decoding algorithms and the pruning technique are introduced in Section III. Section IV provides the performance and complexity analysis based on the simulation results of polar codes under ISC decoders with different parameters. Finally, Section V concludes the paper.

II. Preliminaries 

A. Notation Convention 

In this paper, we use blackboard bold letters, such as $\mathbb{X}$ and $\mathbb{Y}$, to denote sets, and use $|\mathbb{X}|$ to denote the number of elements in $\mathbb{X}$. We write the Cartesian product of $\mathbb{X}$ and $\mathbb{Y}$ as $\mathbb{X} \times \mathbb{Y}$, and write the $n$-th Cartesian power of $\mathbb{X}$ as $\mathbb{X}^n$.

We use calligraphic characters, such as $\mathcal{E}$, to denote an event, and let $\bar{\mathcal{E}}$ denote the event that $\mathcal{E}$ does not happen.

We use the notation $v_1^N$ to denote an $N$-dimensional vector $(v_1, v_2, \dots, v_N)$ and $v_i^j$ to denote a subvector $(v_i, \dots, v_{j-1}, v_j)$ of $v_1^N$, $1 \le i, j \le N$. Particularly, when $i > j$, $v_i^j$ is a vector with no elements in it, and the empty vector is denoted by $\phi$. We write $v_{1,o}^N$ to denote the subvector of $v_1^N$ with odd indices ($v_k$: $1 \le k \le N$; $k$ is odd). Similarly, we write $v_{1,e}^N$ to denote the subvector of $v_1^N$ with even indices ($v_k$: $1 \le k \le N$; $k$ is even). For example, for $v_1^4$, $v_2^3 = (v_2, v_3)$, $v_{1,o}^4 = (v_1, v_3)$ and $v_{1,e}^4 = (v_2, v_4)$. Further, given an index set $\mathbb{I}$, $v_{\mathbb{I}}$ denotes the subvector of $v_1^N$ which consists of the $v_i$s with $i \in \mathbb{I}$.

Only square matrices are involved in this paper, and they are denoted by bold letters. The subscript of a matrix indicates its size, e.g. $\mathbf{F}_N$ represents an $N \times N$ matrix $\mathbf{F}$. We write the Kronecker product of two matrices $\mathbf{F}$ and $\mathbf{G}$ as $\mathbf{F} \otimes \mathbf{G}$, and write the $n$-th Kronecker power of $\mathbf{F}$ as $\mathbf{F}^{\otimes n}$.
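The vector conventions above can be mirrored in a few lines of Python. This is only an illustrative sketch: the helper names `sub`, `odd` and `even` are hypothetical (they do not appear in the paper), and vectors are represented as tuples whose 1-based indices are mapped onto Python's 0-based slices.

```python
def sub(v, i, j):
    """Subvector v_i^j with 1-based, inclusive indices; empty when i > j."""
    return tuple(v[i - 1:j]) if i <= j else ()


def odd(v):
    """Entries with odd 1-based indices: (v_1, v_3, ...)."""
    return tuple(v[0::2])


def even(v):
    """Entries with even 1-based indices: (v_2, v_4, ...)."""
    return tuple(v[1::2])


# The example from the text, using symbolic entries for v_1^4.
v = ("v1", "v2", "v3", "v4")
example = (sub(v, 2, 3), odd(v), even(v), sub(v, 3, 2))
```

Here `example` reproduces the three subvectors of the worked example in the text, followed by the empty vector obtained for $i > j$.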

B. Polar Codes 

Let $W: \mathbb{X} \to \mathbb{Y}$ denote a B-DMC with input alphabet $\mathbb{X}$ and output alphabet $\mathbb{Y}$. Since the input is binary, $\mathbb{X} = \{0, 1\}$. The channel transition probabilities are $W(y|x)$, $x \in \mathbb{X}$, $y \in \mathbb{Y}$.

For code length $N = 2^n$, $n = 1, 2, \dots$, and information length $K$, i.e. code rate $R = K/N$, the polar coding over $W$ proposed by Arıkan can be described as follows:

After channel combining and splitting operations on $N$ independent uses of $W$, we get $N$ successive uses of synthesized binary input channels $W_N^{(i)}$, $i = 1, 2, \dots, N$, with transition probabilities

and the source block $u_1^N$ is supposed to be uniformly distributed in $\{0, 1\}^N$. Let $P_e(W_N^{(i)})$ denote the probability of maximum-likelihood (ML) decision error of one transmission

and $\oplus$ is the modulo-2 addition.

The reliabilities of the polarized channels $\{W_N^{(i)}\}$ are usually measured by (3), and can be evaluated using Bhattacharyya parameters [1] for binary erasure channels (BECs) or density evolution [3] for other channels.

To transmit a binary message block of $K$ bits, the $K$ most reliable polarized channels $W_N^{(i)}$ with indices $i \in \mathbb{I}$ are picked out for carrying these information bits, and a fixed bit sequence called frozen bits is transmitted over the others. The index set $\mathbb{I} \subseteq \{1, 2, \dots, N\}$ is called the information set and $|\mathbb{I}| = K$. The complement set of $\mathbb{I}$ is called the frozen set and is denoted by $\mathbb{F}$.

Alternatively, the polar coding can be described as follows: A binary source block $u_1^N$ which consists of $K$ information bits and $N - K$ frozen bits is mapped to a code block $x_1^N$ via $x_1^N = u_1^N \mathbf{G}_N$. The matrix $\mathbf{G}_N = \mathbf{B}_N \mathbf{F}_2^{\otimes n}$, where $\mathbf{F}_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ and $\mathbf{B}_N$ is the bit-reversal permutation matrix. The binary channel inputs $x_1^N$ are then sent into the channels which are obtained by $N$ independent uses of $W$.
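The encoding map $x_1^N = u_1^N \mathbf{G}_N$ can be sketched directly from its definition. The following is a minimal, unoptimized illustration in pure Python, assuming the standard kernel $\mathbf{F}_2 = [[1, 0], [1, 1]]$ and $\mathbf{G}_N = \mathbf{B}_N \mathbf{F}_2^{\otimes n}$; the function names are hypothetical, and a practical encoder would use the $O(N \log N)$ butterfly structure instead of an explicit matrix.

```python
F2 = [[1, 0], [1, 1]]  # 2x2 polarization kernel


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def bit_reverse(i, n):
    """Reverse the n-bit binary representation of i."""
    return int(format(i, "0{}b".format(n))[::-1], 2)


def polar_generator(n):
    """G_N = B_N F2^{(x)n} for N = 2^n: bit-reversed rows of the Kronecker power."""
    f = [[1]]
    for _ in range(n):
        f = kron(f, F2)
    return [f[bit_reverse(i, n)] for i in range(1 << n)]


def polar_encode(u):
    """Map a source block u (list of 0/1, length a power of two) to x = u G_N over GF(2)."""
    N = len(u)
    n = N.bit_length() - 1
    if N != 1 << n:
        raise ValueError("block length must be a power of two")
    G = polar_generator(n)
    return [sum(u[i] * G[i][j] for i in range(N)) % 2 for j in range(N)]


G4 = polar_generator(2)
x = polar_encode([0, 0, 1, 1])
```

Which positions of `u` carry information bits and which carry frozen bits is decided separately by the information set $\mathbb{I}$; the sketch only performs the linear map.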

C. Successive Cancellation Decoding 

As mentioned in [1], polar codes can be decoded by the successive cancellation (SC) decoding algorithm. Let $\hat{u}_1^N$ denote the estimate of the source block $u_1^N$. After receiving $y_1^N$, the bits $\hat{u}_i$ are determined successively with index $i$ from 1 to $N$ in the following way:

1 otherwise



The block error rate (BLER) of this SC decoding is upper bounded by

    $P_{SC}(N, \mathbb{I}) \le \sum_{i \in \mathbb{I}} P_e(W_N^{(i)})$    (7)




This successive decoding can be represented as a path searching process on a code tree. For a polar code with code length $N$, the corresponding code tree $\mathcal{T}$ is a full binary tree. More specifically, $\mathcal{T}$ can be represented as a 2-tuple $(\mathbb{V}, \mathbb{E})$ where $\mathbb{V}$ and $\mathbb{E}$ denote the set of nodes and the set of edges respectively, $|\mathbb{V}| = 2^{N+1} - 1$, $|\mathbb{E}| = 2^{N+1} - 2$. The depth of a node $v \in \mathbb{V}$ is the length of the path from the root to the node. The set of all nodes at a given depth $d$ is denoted by $\mathbb{V}_d$, $d = 0, 1, \dots, N$. The root node has a depth of zero. All the edges $e \in \mathbb{E}$ are partitioned into $N$ levels $\mathbb{E}_l$, $l = 1, 2, \dots, N$, such that the edges in $\mathbb{E}_l$ are incident with the nodes at depth $l - 1$ and the nodes at depth $l$. Except for the nodes at the $N$-th depth $\mathbb{V}_N$, each $v \in \mathbb{V}_d$ has two descendants which belong to $\mathbb{V}_{d+1}$, and the two corresponding edges are labeled 0 and 1 respectively. The nodes $v \in \mathbb{V}_N$ are called leaf nodes. Fig. 1 gives a simple example of a code tree with $N = 4$.

An $i$-length decoding path $\{e_1, e_2, \dots, e_i\}$ consists of $i$ edges, with $e_i \in \mathbb{E}_i$, $i \in \{1, 2, \dots, N\}$. A vector $v_1^i$ is used to depict the above decoding path, where $v_i$ corresponds to the binary label of edge $e_i$. The reliability of a decoding path $v_1^i$ can be measured using the a posteriori probability

Fig. 1. An example of code tree for code length N = 4. The bold branches show a decoding path of SC with $\hat{u}_1^4 = 0011$.



exploring. And the gray ones are those which are not visited during the searching process. In the example, four calculations of equation (8) are required, one for each level. However, the decoding path is not guaranteed to be the most probable one. As shown in the example, the path labeled 1000 has the largest probability of all the $N$-length paths, but it failed in the competition at the first level.

For further practical considerations, we use the logarithmic 
APPs as the path metrics: 



(8) 



The APPs can be regarded as normalized versions of the channel transition probabilities defined in (1). The two kinds of probabilities are related by a multiplicative factor $2P(y_1^N)$. By eliminating the factor, the APPs take values in a more stable range, and the probabilities of all the decoding paths with the same length sum to one, i.e.

For $i \in \mathbb{I}$, the path metric can be recursively calculated as

This property will help in understanding the path searching procedure in the code tree and is more suitable for hardware implementation.

Similar to the recursive expressions of (1) given in [1], the APPs can also be calculated recursively. For any $n \ge 0$, $N = 2^n$, $1 \le i \le N$,

where the function $\max{}^*(a, b) = \max(a, b) + \log(1 + e^{-|a - b|})$ is the Jacobian logarithm and $v_{1,e}^{2i} = (v_2, v_4, \dots, v_{2i})$,
Then, the decision function of SC in (6) is rewritten as

1 otherwise    (15)
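The Jacobian logarithm $\max^*(a, b) = \max(a, b) + \log(1 + e^{-|a - b|})$ that appears in the logarithmic recursions is simply $\log(e^a + e^b)$ evaluated in a numerically stable way, which is what makes it suitable for combining two log-domain probabilities. A minimal sketch follows; the name `max_star` is an illustrative choice, and natural logarithms are assumed.

```python
import math


def max_star(a, b):
    """Jacobian logarithm: log(exp(a) + exp(b)) without overflow for large |a|, |b|."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


# Combining two log-probabilities gives the log of their sum.
combined = max_star(math.log(0.25), math.log(0.30))

# A direct log(exp(a) + exp(b)) would overflow here; max_star does not.
large = max_star(1000.0, 1000.0)
```

The correction term is at most $\log 2$ and vanishes when the two arguments are far apart, which is why it is often approximated by a small lookup table in hardware.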



SC decoding can be seen as a greedy search algorithm on the code tree. At each level, only the one of the two edges with the larger probability is selected for further processing.
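The greedy behaviour can be illustrated on a small toy tree. The sketch below is not the paper's Fig. 1: the leaf probabilities are hypothetical values chosen for illustration, every bit is treated as an information bit, and the APP of a prefix is obtained by summing the probabilities of the leaves below it rather than by the recursive formulas.

```python
import math

# Hypothetical leaf probabilities of a toy code tree with N = 3 (not from the paper).
LEAF = {"000": 0.10, "001": 0.15, "010": 0.18, "011": 0.12,
        "100": 0.36, "101": 0.03, "110": 0.03, "111": 0.03}
N = 3


def metric(path):
    """Logarithmic APP of a prefix: log of the total probability of the leaves below it."""
    return math.log(sum(p for leaf, p in LEAF.items() if leaf.startswith(path)))


def sc_decode():
    """Greedy search: at each level keep only the more probable of the two edges."""
    path = ""
    for _ in range(N):
        path = max((path + "0", path + "1"), key=metric)
    return path


sc_path = sc_decode()
best_path = max(LEAF, key=LEAF.get)  # exhaustive (ML) choice over all leaves
```

In this toy example the greedy search commits to the prefix 0 at the first level (probability 0.55 against 0.45) and therefore never reaches the most probable leaf 100, which mirrors the failure mode described for SC.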

The red bold edges in Fig. 1 show the SC decoding path. The number written next to each of the nodes provides the APP metric of the decoding path from the root to that node. The nodes which are extended during the SC decoding procedure are represented by the numbered circles, and the corresponding numbers indicate the processing order. The black circles represent the nodes which are visited (whose APP metric is calculated) but failed in competition for further


Using the space-efficient structure [11] to implement an SC decoder, the time and space complexity are $O(N \log N)$ and $O(N)$ respectively.

III. Improved Successive Cancellation Decoding 

The performance of SC is limited by the bit-by-bit decoding strategy: whenever a bit is wrongly determined, there is no chance to correct it in the future decoding procedure.

Theoretically, the performance of the maximum a posteriori probability (MAP) decoding (or equivalently ML decoding, since the inputs are assumed to be uniformly distributed) can be achieved by traversing all the $N$-length decoding paths in the code tree $\mathcal{T}$. But this brute-force traversal takes exponential complexity and is difficult to implement.

Fig. 2. An example of SCL decoding with searching width L = 2.

Fig. 3. An example of SCS decoding.

Two improved decoding algorithms called successive cancellation list (SCL) decoding and successive cancellation stack (SCS) decoding are proposed in [11], [12] and [13]. Both of these algorithms allow more than one edge to be explored at each level of the code tree. During SCL (SCS) decoding, a set of candidate paths is obtained and stored in a list (stack). Since for every single candidate path, the metric calculations and bit determinations are still performed bit-by-bit successively, SCL and SCS can be regarded as two improved versions of conventional SC decoding.

In this section, we restate SCL and SCS under a unified framework with the help of APP metrics and the code tree representation. Then, to overcome the respective shortcomings of SCL and SCS, a new hybrid decoding algorithm named successive cancellation hybrid (SCH) decoding is proposed. Furthermore, to reduce the computational complexity, we propose a pruning technique to eliminate the unnecessary calculations during the path searching procedure on the code tree.



A. Successive Cancellation List Decoding 

As an enhanced version of SC, the successive cancellation list (SCL) decoder [11], [12] searches level-by-level on the code tree, just as SC does. However, unlike SC, where only one path is reserved after processing at each level, SCL allows at most $L$ candidate paths to be further explored at the next level.

SCL can be regarded as a breadth-first search on the code tree $\mathcal{T}$ with a searching width $L$. At each level, SCL doubles the number of candidates by appending a bit 0 or a bit 1 to each of the candidate paths, and then selects at most $L$ of them with the largest metrics and stores them in a list for further processing at the next level. Finally, when reaching the leaf nodes, the binary labels corresponding to the edges in the path $\{e_1, e_2, \dots, e_N\}$ which has the largest metric in the list are assigned to the estimated source vector $\hat{u}_1^N$.

Let $\mathbb{L}^{(i)}$ denote the set of candidate paths corresponding to level-$i$ of the code tree in an SCL decoder. The $\mathbb{L}^{(i)}$s are stored and updated in a list structure. The SCL decoding algorithm with searching width $L$, denoted by SCL($L$), can be described as follows:

(A.1) Initialization. A null path is included in the initial list and its metric is set to zero, i.e. $\mathbb{L}^{(0)} = \{\phi\}$, $M(\phi) = 0$.

(A.2) Expansion. At the $i$-th level of the code tree, the number of candidate paths in the list is doubled by concatenating new bits $v_i$ taking values of 0 and 1 respectively, that is,

    $\mathbb{L}^{(i)} = \{ (v_1^{i-1}, v_i) \mid v_1^{i-1} \in \mathbb{L}^{(i-1)},\ v_i \in \{0, 1\} \}$    (16)

For each $v_1^i \in \mathbb{L}^{(i)}$, the corresponding path metric(s) are updated according to (12), (13) and (14).

(A.3) Competition. If the number of paths in the list after (A.2) is no more than $L$, just skip this step; otherwise, reserve the $L$ paths with the largest metrics and delete the others.

(A.4) Determination. Repeat (A.2) and (A.3) until level-$N$ is reached. Then, the decoder outputs the estimated source vector $\hat{u}_1^N = v_1^N$, where $v_1^N$ is the binary label vector of the path with the largest metric in the list.
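Steps (A.1) to (A.4) can be sketched on a toy tree as follows. The leaf probabilities are hypothetical illustration values (they are not the numbers of Fig. 2), every bit is treated as an information bit, and the metric of a prefix is computed by brute-force summation instead of the recursions (12) to (14).

```python
import math

# Hypothetical leaf probabilities of a toy code tree with N = 3 (not from the paper).
LEAF = {"000": 0.10, "001": 0.15, "010": 0.18, "011": 0.12,
        "100": 0.36, "101": 0.03, "110": 0.03, "111": 0.03}
N = 3


def metric(path):
    """Logarithmic APP of a prefix: log of the total probability of the leaves below it."""
    return math.log(sum(p for leaf, p in LEAF.items() if leaf.startswith(path)))


def scl_decode(L):
    """Breadth-first search with searching width L over the toy tree."""
    paths = [""]                                          # (A.1) list holds the null path
    for _ in range(N):
        expanded = [p + b for p in paths for b in "01"]   # (A.2) double the candidates
        expanded.sort(key=metric, reverse=True)
        paths = expanded[:L]                              # (A.3) keep the L best
    return paths[0]                                       # (A.4) best path in the list


result_l1 = scl_decode(1)
result_l2 = scl_decode(2)
```

With $L = 1$ the procedure degenerates to the greedy SC search, while $L = 2$ is already enough in this toy example to keep the prefix 1 alive and recover the most probable leaf.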

Fig. 2 gives a simple example of the tree searching under SCL decoding with $L = 2$. Compared with SC in Fig. 1, SCL finds the most probable path 1000. But the number of metric computations is increased from four to seven.

SCL maintains $L$ decoding paths simultaneously and each path consumes $O(N)$ space, so the space complexity of SCL is $O(LN)$. During the decoding process at each level, each of the $L$ candidates is copied once and extended to two new paths; these copy operations require $O(LN)$ computations. Moreover, since the code tree has $N$ levels, a direct implementation of the SCL decoder takes $O(LN^2)$ computations. In [11], a so-called "lazy copy" technique based on the memory sharing structure among the candidate paths is introduced to reduce this copy complexity. Therefore, the SCL decoder can be implemented with computational complexity $O(LN \log N)$.

B. Successive Cancellation Stack Decoding 



Note that the path metric (12) of a certain decoding path with binary label vector $v_1^i$ will not be smaller than that of any of its descendants, i.e. for any $v_{i+1}^j \in \{0, 1\}^{j-i}$ and $i \le j \le N$,

    $M_N^{(i)}(v_1^i | y_1^N) \ge M_N^{(j)}(v_1^j | y_1^N)$    (17)

Hence, if the metric of an $N$-length decoding path is larger than that of another path with length $l < N$, it must also be larger than the metric of any of the $N$-length descendant paths of the latter. So rather than waiting after processing at each level, we can keep on searching along the single candidate path until its metric is no longer the largest. Once an $N$-length path is found with the largest metric among all the candidate paths, its binary label vector is simply output as the final estimation, and the unnecessary computations for extending other paths are saved.
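The monotonicity property (17) holds because the APP of a prefix is the sum of the APPs of its descendants. It can be checked exhaustively on a toy tree; the sketch below uses hypothetical leaf probabilities (not taken from the paper) and a brute-force metric, so it only illustrates the property rather than the recursive computation.

```python
import math
from itertools import product

# Hypothetical leaf probabilities of a toy code tree with N = 3 (not from the paper).
LEAF = {"000": 0.10, "001": 0.15, "010": 0.18, "011": 0.12,
        "100": 0.36, "101": 0.03, "110": 0.03, "111": 0.03}
N = 3


def metric(path):
    """Logarithmic APP of a prefix: log of the total probability of the leaves below it."""
    return math.log(sum(p for leaf, p in LEAF.items() if leaf.startswith(path)))


def descendants(path):
    """All paths of length at most N that start with `path` (including itself)."""
    return [path + "".join(bits)
            for k in range(N - len(path) + 1)
            for bits in product("01", repeat=k)]


# Collect every (ancestor, descendant) pair that would contradict (17).
all_paths = descendants("")
violations = [(p, d) for p in all_paths for d in descendants(p)
              if metric(d) > metric(p) + 1e-12]
```

An empty `violations` list confirms that, in this example, no path has a larger metric than any of its ancestors, which is the fact the stack search relies on when it stops at the first full-length path on top of the stack.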

The SCS decoder [13] uses an ordered stack $\mathbb{S}$ to store the candidate paths and tries to find the optimal estimation by searching along the best candidate in the stack. Whenever the top path in the stack, which has the largest path metric, reaches length $N$, the decoding process stops and outputs this path. Unlike the candidate paths in the list of SCL, which always have the same length, the candidates in the stack of SCS have different lengths.

Let $D$ denote the maximal depth of the stack $\mathbb{S}$ in the SCS decoder. A little different from the original SCS in [13], an additional parameter $L$ is introduced to limit the number of extending paths with a certain length in the decoding process. A counting vector $c_1^N = (c_1, c_2, \dots, c_N)$ is used to record the number of popped paths with each specific length, i.e. $c_i$ means the number of popped paths with length $i$ during the decoding process.

The SCS decoding algorithm with searching width $L$ and maximal stack depth $D$, denoted by SCS($L$, $D$), is summarized as follows:

(B.1) Initialization: Push the null path into the stack and set the corresponding metric $M(\phi) = 0$. Initialize the counting vector $c_1^N$ with all-zeros, and the instantaneous stack depth $|\mathbb{S}| = 1$.

(B.2) Popping: Pop the path $v_1^{i-1}$ from the top of the stack, and if the path is not null, set $c_{i-1} = c_{i-1} + 1$.

(B.3) Expansion: If $v_i$ is a frozen bit, i.e. $i \in \mathbb{F}$, simply extend the path to $v_1^i = (v_1^{i-1}, u_i)$; otherwise, if $v_i$ is an information bit, extend the current path to $(v_1^{i-1}, 0)$ and $(v_1^{i-1}, 1)$. Then calculate the path metric(s) by (12), (13) and (14).

(B.4) Pushing: For an information bit $v_i$, if $|\mathbb{S}| > D - 2$, delete the path at the bottom of the stack. Then push the two extended paths into the stack. Otherwise, for a frozen bit $v_i$, push the single extended path $v_1^i$ into the stack directly.

(B.5) Competition: If $c_{i-1} = L$, delete all the paths with length less than or equal to $i - 1$ from the stack $\mathbb{S}$.

(B.6) Sorting: Re-sort the paths in the stack from top to bottom in descending order of metric.

(B.7) Determination: If the top path in the stack reaches a leaf node of the code tree, pop it from the stack. The decoding algorithm stops and outputs $\hat{u}_1^N = v_1^N$ as the decision sequence. Otherwise go back and execute step (B.2).
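Steps (B.1) to (B.7) can be sketched on a toy tree as follows. This is an illustration under simplifying assumptions rather than the implementation of [13]: the leaf probabilities are hypothetical, every bit is treated as an information bit (so the frozen-bit branches of (B.3) and (B.4) are omitted), $D \ge 2$ is assumed, and the stack is a plain Python list re-sorted after every step.

```python
import math

# Hypothetical leaf probabilities of a toy code tree with N = 3 (not from the paper).
LEAF = {"000": 0.10, "001": 0.15, "010": 0.18, "011": 0.12,
        "100": 0.36, "101": 0.03, "110": 0.03, "111": 0.03}
N = 3


def metric(path):
    """Logarithmic APP of a prefix: log of the total probability of the leaves below it."""
    return math.log(sum(p for leaf, p in LEAF.items() if leaf.startswith(path)))


def scs_decode(L, D):
    """Stack search; returns (decoded path, number of extended nodes). Assumes D >= 2."""
    stack = [""]                       # (B.1) best path first, worst path last
    popped = [0] * (N + 1)             # popped[k]: popped paths of length k
    extended = 0
    while True:
        path = stack.pop(0)            # (B.2) pop the top path
        k = len(path)
        if k == N:                     # (B.7) a full-length path reached the top
            return path, extended
        if k > 0:
            popped[k] += 1
        extended += 1
        if len(stack) > D - 2:         # (B.4) make room by dropping the bottom path
            stack.pop()
        stack += [path + "0", path + "1"]                  # (B.3) expansion
        if k > 0 and popped[k] == L:   # (B.5) searching-width limit reached
            stack = [p for p in stack if len(p) > k]
        stack.sort(key=metric, reverse=True)               # (B.6) sorting


wide = scs_decode(2, 8)
narrow = scs_decode(1, 8)
tiny_stack = scs_decode(2, 2)
```

In this toy example SCS(2, 8) reaches the most probable leaf after extending four nodes, whereas a breadth-first search of width 2 over the same tree extends five. With $D = 2$ the competing prefix is dropped from the bottom of the stack and the search falls back to the SC result, illustrating the failure that occurs when the stack is too small.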

Fig. 3 gives a simple example of the tree searching under SCS. Compared with SCL in Fig. 2, SCS can also find the most probable path 1000, with two fewer metric computations.

Similar to SC and SCL, the space-efficient structure and the "lazy copy" technique are applied in the implementation of SCS decoders. The time and space complexity of SCS are $O(LN \log N)$ and $O(DN)$ respectively. However, under the same searching width $L$, the actual computations of SCS($L$, $D$) will be much fewer than those of SCL($L$) when working in the moderate or high SNR regime.

C. Hybrid SCL and SCS 

Compared with SCL, SCS decoding can save a lot of unnecessary computations, especially when working in the high signal-to-noise ratio (SNR) regime [13]. However, the stack used in SCS consumes a much larger space than SCL. Theoretically, to prevent performance deterioration, the stack depth $D$ needs to be as large as $LN$, thus the space complexity becomes $O(LN^2)$. Fortunately, as shown in [13], a much smaller stack depth $D$ is enough for moderate and high SNR regimes. But the most appropriate value of $D$ depends on the specific SNR and is hard to determine.

Fig. 4. Mode transition diagram of SCH decoding.

In this paper, a new hybrid decoding algorithm called successive cancellation hybrid (SCH) is proposed. SCH, as the name suggests, is a hybrid of SCL and SCS. SCH has two working modes called on-going and waiting. At first, the SCH decoder works in the on-going mode: it searches along the best candidate path using an ordered stack, just as SCS does. But when the stack is about to be full, SCH stops searching forward and switches to the waiting mode. In the waiting mode, SCH turns to extending the shortest path in the stack until all the candidate paths in the stack have the same length. The processing in the waiting mode is somewhat similar to SCL and it decreases the number of paths in the stack. Then, SCH switches back to the on-going mode again. Fig. 4 gives a graphic illustration. This decoding procedure goes on until an $N$-length path appears at the top of the stack.

The SCH algorithm with searching width $L$ and maximal stack depth $D$, denoted by SCH($L$, $D$), is summarized as follows:

(C.1) Initialization: Push the null path into the stack $\mathbb{S}$ and set the corresponding metric $M(\phi) = 0$. Initialize the counting vector $c_1^N$ with all-zeros, and the instantaneous stack depth $|\mathbb{S}| = 1$. The working mode flag $f_{mode}$ is set to 0, where 0 denotes the on-going mode and 1 denotes the waiting mode.

(C.2) Popping: When $f_{mode} = 0$, pop the path $v_1^{i-1}$ from the top of the stack; else when $f_{mode} = 1$, pop the path $v_1^{i-1}$ with the shortest path length in the stack. Then, if the popped path is not null, i.e. $v_1^{i-1} \ne \phi$, set $c_{i-1} = c_{i-1} + 1$.

(C.3) Expansion: If $v_i$ is a frozen bit, i.e. $i \in \mathbb{F}$, simply extend the path to $v_1^i = (v_1^{i-1}, u_i)$; otherwise, if $v_i$ is an information bit, i.e. $i \in \mathbb{I}$, extend the current path to $(v_1^{i-1}, 0)$ and $(v_1^{i-1}, 1)$. Then calculate the path metric(s) by (12), (13) and (14).

(C.4) Pushing: For an information (frozen) bit $v_i$, push the two new paths (one new path) into the stack.

(C.5) Competition: If $c_{i-1} = L$, delete all the paths with length less than or equal to $i - 1$ from the stack $\mathbb{S}$.

(C.6) Mode Switching: When $f_{mode} = 0$ and $D - |\mathbb{S}| < 2L - 1$, switch $f_{mode} = 1$; when $f_{mode} = 1$ and all the candidate paths in the stack have equal lengths, switch $f_{mode} = 0$.

(C.7) Sorting: Re-sort the paths in the stack from top to bottom in descending order of metric.

(C.8) Determination: If the top path in the stack reaches a leaf node of the code tree, pop it from the stack. The decoding algorithm stops and outputs $\hat{u}_1^N = v_1^N$ as the decision sequence. Otherwise go back and execute step (C.2).

Fig. 5. The L candidates divide the probability space into L partitions.
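The mode switching of steps (C.1) to (C.8) can be sketched on a toy tree. The sketch makes several assumptions that are not from the paper: the leaf probabilities are hypothetical, every bit is treated as an information bit, ties for the shortest path in the waiting mode are broken in favour of the larger metric, and the two conditions of (C.6) are evaluated as mutually exclusive branches in the order they are listed. The sketch reproduces the search order only; it does not model the memory layout, so it does not enforce $|\mathbb{S}| \le D$ as a hard limit.

```python
import math

# Hypothetical leaf probabilities of a toy code tree with N = 3 (not from the paper).
LEAF = {"000": 0.10, "001": 0.15, "010": 0.18, "011": 0.12,
        "100": 0.36, "101": 0.03, "110": 0.03, "111": 0.03}
N = 3


def metric(path):
    """Logarithmic APP of a prefix: log of the total probability of the leaves below it."""
    return math.log(sum(p for leaf, p in LEAF.items() if leaf.startswith(path)))


def sch_decode(L, D):
    """Hybrid search that alternates between on-going and waiting modes."""
    stack = [""]                       # (C.1) best path first
    popped = [0] * (N + 1)             # popped[k]: popped paths of length k
    waiting = False                    # f_mode: False = on-going, True = waiting
    while True:
        if waiting:                    # (C.2) waiting: pop the shortest path
            path = min(stack, key=len)     # first shortest = best metric among them
            stack.remove(path)
        else:                          # (C.2) on-going: pop the top path
            path = stack.pop(0)
        k = len(path)
        if k > 0:
            popped[k] += 1
        stack += [path + "0", path + "1"]                  # (C.3), (C.4)
        if k > 0 and popped[k] == L:                       # (C.5) competition
            stack = [p for p in stack if len(p) > k]
        if not waiting and D - len(stack) < 2 * L - 1:     # (C.6) stack nearly full
            waiting = True
        elif waiting and len({len(p) for p in stack}) == 1:
            waiting = False
        stack.sort(key=metric, reverse=True)               # (C.7) sorting
        if len(stack[0]) == N:                             # (C.8) determination
            return stack[0]


small_stack = sch_decode(2, 4)
large_stack = sch_decode(2, 64)
greedy = sch_decode(1, 2)
```

In this toy example both SCH(2, 4) and SCH(2, 64) return the most probable leaf, the former passing through the waiting mode and the latter never leaving the on-going mode, while SCH(1, 2) reduces to the greedy SC result.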

The time and space complexity of SCH are $O(LN \log N)$ and $O(DN)$ respectively. The actual number of computations of SCH decoding is less than that of SCL but is usually more than that of SCS.

For SCH decoding, since no path is dropped when the stack is about to be full, the performance is not affected by $D$. However, when the decoder stays in the waiting mode, unnecessary computations are performed. And the smaller the maximum stack depth $D$ is, the more likely the decoder is to switch to the waiting mode. So the computational complexity grows as $D$ decreases. To have enough space for the waiting mode, the minimum value of $D$ is $2L$. Particularly, when $D = 2L$, SCH($L$, $D$) is equivalent to SCL($L$); and when $D \ge LN$, SCH($L$, $D$) is equivalent to SCS($L$, $D$).

D. Pruning Technique 

During the path searching on the code tree, the candidate paths with too small metrics and their descendants will hardly have the chance to be reserved in the future process. In this subsection, we propose a pruning technique to reduce the computational complexity of the improved successive cancellation decoding algorithms.

An additional vector $a_1^N$ is used to record the pruning reference for each level, where $a_i$ is the largest metric of all the traversed $i$-length decoding paths on the code tree. More specifically, for SCL decoding,

    $a_i = \max_{v_1^i \in \mathbb{L}^{(i)}} M_N^{(i)}(v_1^i | y_1^N)$    (18)

And equivalently, for SCS and SCH, $a_i$ is set to the metric of the first $i$-length path popped off the stack.

We introduce a new parameter called the probability ratio threshold $\tau$. During the processing at level-$i$ on the code tree, an $i$-length path with metric smaller than $a_i - \log(\tau)$ is dropped directly. Recall that the path metrics are defined as the logarithmic APPs (12). Therefore, the pruned paths are those whose APPs satisfy

    $P_N^{(i)}(v_1^i | y_1^N) < \frac{e^{a_i}}{\tau}$    (19)
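The pruning rule can be sketched as a filter on the candidates of one level: the reference $a_i$ is the largest metric at that level, and any path whose metric falls below $a_i - \log(\tau)$ is dropped. The function name `prune`, the candidate metrics and the value of `tau` below are hypothetical illustration choices, and natural logarithms are assumed for the metrics.

```python
import math


def prune(metrics, tau):
    """Keep the paths whose log-APP metric is at least a_i - log(tau).

    `metrics` maps each candidate path of one level to its logarithmic APP.
    """
    a_i = max(metrics.values())            # pruning reference for this level
    threshold = a_i - math.log(tau)
    return {path: m for path, m in metrics.items() if m >= threshold}


# Hypothetical level-2 candidates with APPs 0.39, 0.30, 0.25 and 0.06.
level2 = {"10": math.log(0.39), "01": math.log(0.30),
          "00": math.log(0.25), "11": math.log(0.06)}
survivors = prune(level2, tau=4.0)
```

With $\tau = 4$ the threshold corresponds to an APP of $0.39 / 4 = 0.0975$, so only the path with APP 0.06 is removed. The best path of a level is never pruned for $\tau \ge 1$, since its metric equals $a_i$.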



Intuitively, the correct path may be dropped in this pruning operation. In the following part of this subsection, an upper bound on the additional performance deterioration brought by $\tau$ is derived and a conservative configuration of $\tau$ is given.

Hereafter, SCL, SCS and SCH are collectively referred to as improved successive cancellation (ISC) decoding algorithms. The block error event of a polar code with information set $\mathbb{I}$ under ISC decoding is defined as

    $\mathcal{E} = \{ (u_1^N, \hat{u}_1^N, y_1^N) \in \mathbb{X}^N \times \mathbb{X}^N \times \mathbb{Y}^N : u_{\mathbb{I}} \ne \hat{u}_{\mathbb{I}} \}$    (20)

By introducing pruning operations, the error events can be classified into two kinds. In the first kind, the correct path is not lost until the final decision phase, i.e. the correct path is contained in the final list (or stack) but does not have the largest metric. In the second kind, the correct path is lost before the decision step. So the block error rate (BLER) of ISC can be decomposed as

    $P_{ISC}(N, \mathbb{I}, L, D, \tau) = P(\mathcal{E} | \bar{\mathcal{C}}) P(\bar{\mathcal{C}}) + P(\mathcal{C})$    (21)

where $\mathcal{C}$ denotes the event that the correct path is lost.

The event $\mathcal{C}$ can be further decomposed as

    $\mathcal{C} = \bigcup_{i \in \mathbb{I}} \mathcal{C}_i$    (22)

where $\mathcal{C}_i$ is the event that the correct path is not lost until the processing at the $i$-th level.

There are three kinds of events which lead to path loss at the $i$-th level. The first is brought by the searching width limitation, i.e. the correct path is excluded from the $L$ best paths in the $i$-th decoding step, and is denoted by $\mathcal{L}_i$. The second is brought by the maximum probability ratio limitation, i.e. the metric of the correct path is much smaller than that of the best one, and is denoted by $\mathcal{T}_i$. The third is brought by the maximum stack depth limitation, which only exists in SCS decoding: the correct path is abandoned when the path length equals $i$ and the metric is much smaller than the maximum one at that moment. This event is denoted by $\mathcal{D}_i$. Then

    $P(\mathcal{C}_i) = P(\mathcal{L}_i) + P(\mathcal{T}_i | \bar{\mathcal{L}}_i) P(\bar{\mathcal{L}}_i) + P(\mathcal{D}_i | \bar{\mathcal{L}}_i \bar{\mathcal{T}}_i) P(\bar{\mathcal{L}}_i \bar{\mathcal{T}}_i)$    (23)

For SCL, SCH decoding or SCS with a large enough stack depth $D$, $P(\mathcal{D}_i | \bar{\mathcal{L}}_i \bar{\mathcal{T}}_i) = 0$.

The additional BLER performance deterioration brought by pruning is

    $\sum_{i \in \mathbb{I}} P(\mathcal{T}_i \bar{\mathcal{L}}_i) = \sum_{i \in \mathbb{I}} P(\mathcal{T}_i | \bar{\mathcal{L}}_i) P(\bar{\mathcal{L}}_i)$    (24)

During the processing on the code tree $\mathcal{T}$, we will have at most $L$ paths at level-$i$ with APPs $\{p_1, p_2, \dots, p_L\}$ which are calculated by (8), and

    $q = \sum_{j=1}^{L} p_j \le 1$    (25)

Without loss of generality, we assume that $p_1 \ge p_2 \ge \dots \ge p_L$. Under the assumption that one of these paths is the correct path, the $L$ probabilities divide the whole probability space into $L$ parts as shown in Fig. 5. The event of correct path loss in the pruning processing at the $i$-th level has probability

    $P(\mathcal{T}_i | \bar{\mathcal{L}}_i) = \frac{1}{q} \sum_{j \in \{2, \dots, L\},\ p_j < p_1 / \tau} p_j$    (26)

For each of these eliminated paths, the corresponding probability satisfies

    $p_j < \frac{p_1}{\tau} \le \frac{1}{\tau}$    (27)

where $j \in \{2, 3, \dots, L\}$. Since at most $L - 1$ paths are eliminated and $p_1 \le q$, we have

    $P(\mathcal{T}_i | \bar{\mathcal{L}}_i) \le \frac{L - 1}{\tau}$    (28)

Fig. 7. BLER under different code rate

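The additional error probability brought by $\tau$ is therefore upper bounded by

    $\sum_{i \in \mathbb{I}} P(\mathcal{T}_i | \bar{\mathcal{L}}_i) P(\bar{\mathcal{L}}_i) \le \sum_{i \in \mathbb{I}} P(\mathcal{T}_i | \bar{\mathcal{L}}_i) \le \frac{K(L - 1)}{\tau}$    (29)

Given a tolerable performance deterioration $P_{tol}$, the value of $\tau$ can be determined as

    $\tau = \frac{K(L - 1)}{P_{tol}}$    (30)

In most cases, since the upper bound in (29) is very loose, the actual performance deterioration is usually far less than $P_{tol}$. The configuration of $\tau$ in (30) is very conservative.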
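As a worked instance of (30), the threshold can be evaluated for a code with $N = 1024$ and $R = 1/2$ (so $K = 512$), searching width $L = 32$ and $P_{tol} = 10^{-5}$, which are the values used in the simulation section. The function name `pruning_threshold` is a hypothetical choice for this sketch, and the natural logarithm is assumed for the metric offset.

```python
import math


def pruning_threshold(K, L, p_tol):
    """Conservative probability ratio threshold tau = K (L - 1) / P_tol from (30)."""
    if L < 2:
        raise ValueError("pruning only applies when L >= 2")
    if not 0.0 < p_tol < 1.0:
        raise ValueError("P_tol must lie strictly between 0 and 1")
    return K * (L - 1) / p_tol


tau = pruning_threshold(K=512, L=32, p_tol=1e-5)
log_tau = math.log(tau)   # offset subtracted from a_i in the log domain
```

The resulting $\tau$ is about $1.59 \times 10^9$, so a path is pruned only when its APP is more than nine orders of magnitude below that of the best path at the same level. This reflects how conservative the configuration is.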



IV. Simulation Results 

In this section, the performance and complexity of the improved successive cancellation (ISC) decoding algorithms will be discussed.

To simplify the complexity evaluation of polar decoding, we measure the average computational complexity in terms of the number of metric recursive operations, which are defined in (13) or (14). For example, for $N = 1024$, the computational complexity of the SC decoder is $N \log_2 N = 1024 \times 10 \approx 10^4$.

Fig. 8. Complexity under different decoding algorithms (curves: SC, SCL(32), SCS(32, 1024), SCH(32, 512) and pruned SCH(32, 512); N = 1024, R = 0.5)

Fig. 6 gives the simulation results with code length N set to 1024 and 2048 and code rate R = 1/2, and Fig. 7 shows the BLER performance with code rate R set to 1/3 and 2/3 and the code length N fixed at 1024. The lower bounds on the BLER performance under maximum-likelihood (ML) decoding are obtained by performing SCL(32) decoding and counting the number of times the decoded codeword is more likely than the transmitted one. The probability ratio threshold τ for the pruning operation is set by (30) with $P_{tol} = 10^{-5}$. As shown in the figures, under proper configurations, all three decoding algorithms can achieve performance very close to that of ML decoding.
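The counting procedure behind the ML lower bound can be sketched as follows. Whenever the decoder outputs a wrong codeword that is more likely than the transmitted one, an ML decoder would also have failed on that block, so the fraction of such blocks lower-bounds the ML BLER. The Python sketch assumes BPSK over an AWGN channel (bit 0 mapped to +1, bit 1 to -1), so that "more likely" reduces to "closer in Euclidean distance". The function names and the toy trials are illustrative and are not taken from the paper.

```python
def squared_distance(received, codeword):
    """Squared Euclidean distance from the received samples to the BPSK image of codeword."""
    return sum((y - (1.0 - 2.0 * c)) ** 2 for y, c in zip(received, codeword))


def is_ml_error(received, transmitted, decoded):
    """True if the decoded codeword is wrong and more likely than the transmitted one."""
    if list(decoded) == list(transmitted):
        return False
    return squared_distance(received, decoded) < squared_distance(received, transmitted)


def ml_bler_lower_bound(trials):
    """Fraction of (received, transmitted, decoded) trials on which ML decoding must also fail."""
    trials = list(trials)
    hits = sum(is_ml_error(y, x, x_hat) for y, x, x_hat in trials)
    return hits / len(trials)


# Toy trials (assumed data): only the second one counts towards the bound.
trials = [
    ([0.9, 1.1, 0.8, 1.2], [0, 0, 0, 0], [0, 0, 0, 0]),    # decoded correctly
    ([-0.4, -0.6, 0.9, 1.0], [0, 0, 0, 0], [1, 1, 0, 0]),  # wrong, and more likely
    ([0.7, 0.2, 1.1, 0.9], [0, 0, 0, 0], [0, 1, 0, 0]),    # wrong, but less likely
    ([1.0, 0.6, 1.3, 0.7], [0, 0, 0, 0], [0, 0, 0, 0]),    # decoded correctly
]
print(ml_bler_lower_bound(trials))
```

The third trial is a decoder error that ML decoding might have avoided, so it contributes to the measured BLER of the decoder but not to the lower bound.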

The average computational complexities of the different decoding algorithms with code length N = 1024 and code rate R = 1/2 are shown in Fig. 8. We can see that the complexity of SCH does not decrease monotonically with increasing SNR. This is because the switching between the two working modes depends on the specific code construction and searching procedure. However, SCH always has a much lower computational complexity than SCL. Although it needs more computations than SCS, SCH occupies less memory space without any deterioration in performance. In fact, under some specific configurations, SCH is equivalent to the other two decoding algorithms: when D = 2L, SCH(L, 2L) is equivalent to SCL(L); and when D is very large, SCH(L, D) is equivalent to SCS(L, D). Therefore, SCH can achieve a better trade-off between computational complexity and space complexity. Furthermore, by applying the pruning technique introduced in Section III-D, the computational complexity can be significantly reduced and brought very close to that of SC in the moderate and high SNR regime.

Fig. 10. Complexity under different L

Fig. 12. Complexity under different D

Compared with SC, the ISC decoding algorithms introduce three more parameters: the searching width L, the maximum stack depth D, and the probability ratio threshold for pruning τ. In the remainder of this section, we analyze the impact of these three parameters on performance and complexity one by one.



A. On Different Searching Width L 

Fig. 9 gives the performance comparisons under SCL decoding with different L. The code length and code rate are set to N = 1024 and R = 0.5, respectively. The searching width L varies from 1 (equivalent to SC) to 64.

Note that SCL(L) is equivalent to SCH(L, 2L) and to SCS(L, D) with a large enough D. The effects of varying L in SCL decoding are therefore the same as in SCS and SCH.

The larger the searching width is, the less probable it is to lose the correct path, i.e., $P(\mathcal{L}_i)$ in (23) is a decreasing function of L. But according to the results depicted in Fig. 10, the computational complexity is approximately proportional to L. As shown in Fig. 9, L = 32 is good enough for N = 1024 and R = 0.5.



B. On Different Stack Depth D 

For polar codes under SCS decoding, too small a value of the maximum stack depth D leads to significant deterioration in performance. As shown in Fig. 11, D needs to be larger than 1024 for SCS decoding. For SCH, however, the choice of D no longer affects the BLER performance, only the computational complexity. As shown in Fig. 12, the computational complexity of SCH decreases as D increases. Although it needs more computations than SCS, SCH occupies less memory space without any deterioration in performance. Compared with SCL, SCH has much lower computational complexity and requires only a little more memory space. In fact, under some specific configurations, SCH is equivalent to the other two decoding algorithms: when D = 2L, SCH(L, 2L) is equivalent to SCL(L); and when D is very large, SCH(L, D) is equivalent to SCS(L, D). Hence, SCH can achieve a better trade-off between computational complexity and space complexity.

C. On Different Pruning Ratio τ

Fig. 14. Complexity under different τ

Fig. 13 and Fig. 14 give simulations of polar codes with code length N = 1024 and code rate R = 0.5 over binary-input additive white Gaussian noise channels (BAWGNCs). The codes are decoded by SCH decoders with L = 32 and D = 256, and τ varies from 1 to $10^8$. As shown in the figures, the computational complexity is reduced as τ decreases, while the BLER performance deteriorates when τ is too small. Larger values of τ, such as $10^4$ to $10^8$, introduce little deterioration in performance but lead to larger complexities. However, when the codes work in a moderate signal-to-noise ratio (SNR) regime, such as 2.5 dB where the BLER is less than $10^{-3}$, the differences in computational complexity between SC and SCH decoding under different τ become negligible, as shown in Fig. 14.
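The role of the threshold can be illustrated with the pruning rule underlying (26): a path is dropped when its probability falls below the largest candidate probability divided by τ. The Python sketch below applies that rule to a made-up list of path probabilities. The function name and the numbers are hypothetical and serve only to show how the number of surviving paths changes with τ.

```python
def prune_paths(probs, tau):
    """Keep the paths whose probability is at least max(probs) / tau."""
    if tau < 1:
        raise ValueError("tau is expected to be at least 1")
    threshold = max(probs) / tau
    return [p for p in probs if p >= threshold]


# Assumed candidate path probabilities at one level of the code tree.
probs = [0.5, 0.2, 1e-3, 1e-6, 1e-10]

for tau in (1, 1e2, 1e4, 1e8):
    survivors = prune_paths(probs, tau)
    print(f"tau = {tau:g}: {len(survivors)} of {len(probs)} paths kept")
```

A small τ keeps only the strongest candidates, which saves metric computations but risks discarding the correct path. A large τ keeps nearly everything and behaves like the unpruned decoder.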



V. Conclusion 

The successive cancellation (SC) decoding algorithm of polar codes and its improved versions, successive cancellation list (SCL) and successive cancellation stack (SCS), are restated as path searching procedures on the code tree of polar codes. Combining the ideas of SCL and SCS, a new decoding algorithm named successive cancellation hybrid (SCH) is proposed, which can achieve a better trade-off between computational complexity and space complexity. To avoid unnecessary path searching, a pruning technique suitable for all improved successive cancellation (ISC) decoders is proposed. Performance and complexity analyses based on simulations show that, with the help of the pruning technique, all the ISC decoders can achieve performance very close to that of maximum-likelihood (ML) decoding, and their computational complexities can be very close to that of SC in the moderate and high signal-to-noise ratio (SNR) regime.